Let’s begin by loading the pre-calculated files; this is more resource-friendly and makes everything much faster.

What do we have?

xrange: the x-axis range for our hypothesis space: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

yrange: the y-axis range for our hypothesis space: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

alphas: the values of alpha I precalculated: -5, -2, -1, -0.5, -0.1, 0, 0.1, 0.5, 1, 2, 5

hyp: all possible hypotheses in this space: there are nHyp = 3025

pts: all of the points in this space: there are nPts = 100

posProbPts: a 3d array of size nPts x nHyp x length(alphas). This gives the probability of generating each point, assuming it is positive, given each hypothesis and each alpha. These are all the likelihoods (for the positive observations) we need to know.

negProbPts: a 3d array of size nPts x nHyp x length(alphas). This gives the probability of generating each point, assuming it is negative, given each hypothesis and each alpha. These are all the likelihoods (for the negative observations) we need to know.

allProbPts: a 3d array of size nPts x nHyp x length(alphas). This gives the probability of generating each point whether it is positive or negative (i.e., the two summed together), given each hypothesis and each alpha. These are all the likelihoods (for both positive and negative observations) we need to know.

consPts: a 2d array of size nPts x nHyp. For each point and hypothesis, this is TRUE if that point is contained within that hypothesis and FALSE otherwise. Useful as a quick lookup.
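As a rough sketch of how these objects fit together (the original analysis appears to be in R; this Python reconstruction is illustrative only, and the interpretation of points as unit cells on the 0–10 corner grid is an assumption that happens to reproduce nHyp = 3025 and nPts = 100):

```python
import numpy as np

# Assumed reconstruction: corner coordinates 0..10 give 10x10 = 100 unit
# cells (points) and C(11,2)^2 = 55^2 = 3025 axis-aligned rectangles
# (hypotheses).
xrange_ = yrange_ = range(11)

# Points: the 100 unit cells, identified by their lower-left corner.
pts = [(x, y) for x in range(10) for y in range(10)]  # nPts = 100

# Hypotheses: all rectangles (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
hyps = [(x1, y1, x2, y2)
        for x1 in xrange_ for x2 in xrange_ if x1 < x2
        for y1 in yrange_ for y2 in yrange_ if y1 < y2]  # nHyp = 3025

# consPts: True iff the unit cell at (x, y) lies inside the rectangle.
consPts = np.array([[x1 <= x and x + 1 <= x2 and y1 <= y and y + 1 <= y2
                     for (x1, y1, x2, y2) in hyps]
                    for (x, y) in pts])
print(len(hyps), consPts.shape)  # 3025 (100, 3025)
```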

The true hypothesis

Let’s generate the true hypothesis
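A hypothetical sketch of what generating a true hypothesis might look like (the function name and seeding are illustrative, not the notebook’s actual code): draw two distinct corner coordinates per axis and sort them so the rectangle is well-formed.

```python
import random

def sample_true_hypothesis(lo=0, hi=10, seed=None):
    """Draw a random axis-aligned rectangle (x1, y1, x2, y2) with
    x1 < x2 and y1 < y2 on the corner grid lo..hi."""
    rng = random.Random(seed)
    x1, x2 = sorted(rng.sample(range(lo, hi + 1), 2))
    y1, y2 = sorted(rng.sample(range(lo, hi + 1), 2))
    return (x1, y1, x2, y2)

print(sample_true_hypothesis(seed=1))
```

Passing a seed makes the draw reproducible across runs, which is useful when comparing teachers on the same true hypothesis.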

Common data structures

pts: the set of points tracked by an agent. Each row is a point. The columns are: x (the x coordinate of that point); y (the y coordinate of that point); selected (TRUE if the teacher has provided that point; initialised to FALSE); type (“positive” or “negative”, initialised to NA since no points have been seen at the start); prior (the prior probability of each point, initialised to be uniform); and posterior (posterior probability of each point, initialised to be uniform). As the agent sees new points, they update selected, type, and posterior.

hyps: the set of hypotheses tracked by an agent. Each row is a hypothesis. The columns are: x1, y1, x2, y2 (the coordinates of that hypothesis); size (the size of that hypothesis); negSize (the size of the space around the hypothesis); prior (the prior over hypotheses, initialised to be uniform); nPos and nNeg (the number of positive and negative points seen so far, initialised to zero); consPos and consNeg (TRUE if the positive/negative points seen so far are all consistent with that hypothesis; initialised to TRUE because none have been seen yet); likePos and likeNeg (the probability of having seen those positive/negative datapoints given that hypothesis; zero if they are not consistent); posterior (the posterior probability of that hypothesis, obtained by combining the prior with likePos and likeNeg). As the agent sees new points, they update everything after the prior column.
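The posterior column described above combines the prior with the two likelihood columns and renormalises. A minimal sketch of that update (function name and toy numbers are hypothetical; the real code presumably operates on the hyps table itself):

```python
import numpy as np

def posterior_over_hyps(prior, like_pos, like_neg):
    """Combine prior with positive/negative likelihoods and normalise.
    Inconsistent hypotheses have a zero likelihood, so they drop out."""
    unnorm = prior * like_pos * like_neg
    return unnorm / unnorm.sum()

prior = np.full(4, 0.25)                    # uniform prior over 4 toy hypotheses
like_pos = np.array([0.2, 0.1, 0.0, 0.1])   # zero = inconsistent with a positive
like_neg = np.array([0.5, 0.5, 0.5, 0.0])   # zero = inconsistent with a negative
post = posterior_over_hyps(prior, like_pos, like_neg)
print(post)  # [0.667, 0.333, 0, 0] (to 3 d.p.)
```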

Now let’s look at different teachers

The teachers can vary according to two parameters:

tchAlpha: the teacher’s own alpha value. We vary it systematically over the following values: -1, 0, 1

tchLnAlpha: the alpha value the teacher assumes the learner attributes to the teacher. We vary it systematically over the following values: -1, 0, 1
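Crossing the two parameters gives a 3 x 3 grid of teacher configurations. A small sketch of enumerating them (field names follow the text; the construction itself is illustrative):

```python
from itertools import product

alphas = (-1, 0, 1)
teachers = [{"tchAlpha": a, "tchLnAlpha": la}
            for a, la in product(alphas, alphas)]
print(len(teachers))  # 9 teacher configurations
```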

Teacher priors

First we want to capture the different priors the teacher might have about what the learner expects to see. This depends on the teacher’s value of tchLnAlpha. We consider three different sets of points (each is a pts data structure):

tchPtsLnD: this corresponds to the points the teacher is tracking to model the learner, assuming that the learner thinks the teacher is being deceptive (alpha=-1)

tchPtsLnR: this corresponds to the points the teacher is tracking to model the learner, assuming that the learner thinks the teacher is being random (alpha=0)

tchPtsLnH: this corresponds to the points the teacher is tracking to model the learner, assuming that the learner thinks the teacher is being helpful (alpha=1)

Teacher sampling

Let’s look at how teachers sample depending on their alpha and their assumptions about the learner’s alpha.
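One standard formulation of this kind of sampling weights each candidate point by the modelled learner’s resulting belief in the true hypothesis, raised to the power alpha; whether the notebook uses exactly this form is an assumption, so the sketch below is illustrative only. It does, however, reproduce the three regimes in the text: alpha = 1 prefers informative points (helpful), alpha = 0 is uniform (random), and alpha = -1 prefers misleading points (deceptive).

```python
import numpy as np

def teacher_sample_probs(learner_belief, alpha):
    """Probability of the teacher choosing each candidate point,
    where learner_belief[i] is the modelled learner's belief in the
    true hypothesis after seeing point i. (Assumed form: belief^alpha,
    normalised.)"""
    weights = learner_belief ** alpha
    return weights / weights.sum()

learner_belief = np.array([0.5, 0.3, 0.2])  # toy values for 3 candidate points
for alpha in (-1, 0, 1):
    print(alpha, teacher_sample_probs(learner_belief, alpha))
```

With alpha = 0 every point is equally likely; with alpha = 1 the most informative point (belief 0.5) is favoured; with alpha = -1 the least informative point (belief 0.2) is favoured.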

Now let’s look at different learners

From the learner’s end, they don’t know the true hypothesis; instead they make different inferences based on different data, so that’s what we want to vary. There are a few factors here.

First we’ll look at the ones that don’t do the extra recursive layer of inference.

The plots are saved individually (and therefore are easier to see) in figures/testsims/

Non-recursive: Deceptive

p1D

p2D

p3D

p4D

Non-recursive: Helpful

p1H

p2H

p3H

p4H

Non-recursive: Weak (i.e., random)

p1W

p2W

p3W

p4W

Recursive: Deceptive, assumes deceptive (i.e., uninformative)

p1DD1

p2DD1

p3DD1

p4DD1

Recursive: Deceptive, assumes helpful (i.e., misleading)

p1DH1

p2DH1

p3DH1

p4DH1